Abstract

Severe acute respiratory syndrome (SARS) is an acute respiratory illness that broke out in China. SARS coronavirus
(SARS_CoV), a novel human coronavirus, is known to be responsible for SARS infection. One of the major proteins
associated with SARS_CoV, SARS 3C-like protease (SARS_3CL pro), functions as a cysteine protease that cleaves the
viral precursor polyprotein into a series of functional proteins required for coronavirus replication, and it is
considered an appealing target for designing anti-SARS agents. To facilitate studies of the functions and structures of
SARS_3CL pro, in this report the synthetic genes encoding 3CL pro of SARS_CoV were assembled, and the plasmid was
constructed using pQE30 as the vector and expressed in Escherichia coli M15 cells. The expressed protease, obtained in
high yield (~15 mg/L), was purified by NTA-Ni 2+ affinity chromatography and an FPLC system, and its sequence was
determined by LC/MS with a residue coverage of 46.4%.

© 2003 Elsevier Inc. All rights reserved. 
From the end of 2002 to June 2003, a severe epidemic disease called severe acute
respiratory syndrome (SARS) broke out in China, and SARS infection also spread to
more than 30 countries. By using biophysical and biochemical techniques such as
electron microscopy, virus-discovery microarrays containing conserved nucleotide
sequences characteristic of many virus families, randomly primed RT-PCR, and
serological tests, it has been determined that SARS coronavirus (SARS_CoV) is
responsible for SARS infection [1-3]. Coronaviruses are positive-stranded RNA
viruses that show a halo or corona appearance when viewed under a microscope and
have the largest viral RNA genomes known to date. Studies have suggested that
SARS_CoV is a previously unknown coronavirus that is neither a mutant of any known
coronavirus nor a recombinant of known coronaviruses; it is believed to be a novel
human coronavirus that possibly originated from a non-human host [4,5].

Proteolytic processing of viral polyproteins is a key step in the replication cycle
of many positive-strand RNA viruses, and such processing is performed by the
virus-encoded proteases [6,7]. It is known that the replicase gene, which encodes the
proteins required for coronavirus replication and transcription, encompasses more
than 20,000 nucleotides [8,9] and encodes two overlapping polyproteins, pp1a
(replicase 1a, around 450 kDa) and pp1ab (replicase 1ab, around 750 kDa), which
feature the sequence motifs of both a papain-like cysteine protease and the 3C-like
protease (3CL pro) [10,11]. Recently, the genome sequences deposited in GenBank
(http://www.ncbi.nlm.nih.gov/) for SARS_CoV from different SARS patients have laid a
solid foundation for research on SARS pathogenesis and anti-SARS drug design [12-14].
It has been demonstrated that the important proteins associated with SARS_CoV
infection include the RNA polymerase, the spike (S) glycoprotein, the envelope (E)
protein, the membrane (M) protein, the nucleocapsid (N) protein, and the main
protease, the 3C-like protease (3CL pro) [15,16]. As the viral main protease, 3CL pro
controls the activities of the coronavirus replication complex [17,18].

Previous research has indicated that the 3CL pro-mediated processing pathways are
conserved in coronaviruses. Coronavirus main proteases employ conserved cysteine and
histidine residues in the catalytic site and lack an acidic active-site residue
[6,19-21]. The substrate specificities of coronavirus main proteases are also well
defined, with the known cleavage sites involving bulky hydrophobic residues (mainly
leucine) at the P2 position, glutamine at the P1 position, and small aliphatic
residues at the P1' position [17,18]. In addition, the recent determination of the
crystal structures of human coronavirus (strain 229E) 3CL pro and of an inhibitor
complex of porcine coronavirus (transmissible gastroenteritis virus, TGEV) 3CL pro
also confirms a remarkable degree of conservation of the substrate-binding sites of
coronavirus 3CL pro [16]. Indeed, studies have already shown that 3CL pro is a useful
target for screening antiviral agents [18,22,23]. As with other 3CL pro enzymes, it
is hoped that SARS_3CL pro will become an appealing target for discovering new agents
for the treatment of SARS [16].
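
The substrate preference described above (mainly leucine at P2, glutamine at P1, and a small residue at P1') can be expressed as a simple sequence scan. The following Python sketch is illustrative only: the function name, the default residue sets (in particular, treating Ala, Gly, and Ser as the small P1' residues), and the test peptide are assumptions and are not taken from this work.

```python
import re

def find_candidate_sites(seq, p2="L", p1="Q", p1_prime="AGS"):
    """Return 0-based indices of P1 residues in P2-P1-P1' matches.

    Cleavage would occur directly after each returned P1 residue.
    The residue sets are assumptions chosen for illustration.
    """
    # A lookahead is used so that overlapping matches are all reported.
    pattern = re.compile(f"(?=[{p2}][{p1}][{p1_prime}])")
    return [m.start() + 1 for m in pattern.finditer(seq.upper())]

# Hypothetical test peptide containing a single L-Q-S motif.
peptide = "AVLQSGFR"
for i in find_candidate_sites(peptide):
    # Print the two fragments that cleavage after P1 would produce.
    print(f"{peptide[:i + 1]} | {peptide[i + 1:]}")
```

A motif scan of this kind only flags candidate sites; it does not predict whether a site is actually processed.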

Therefore, it is important to express and purify large amounts of SARS_3CL pro for
structural and functional research. In our previous work [24,25], we reported a 3D
model of SARS_3CL pro and its inhibitor design by virtual screening, as well as the
cloning, expression, and purification of the E protein of SARS_CoV. In this article,
we present the molecular cloning, expression, and purification of 3CL pro of
SARS_CoV, together with a preliminary study of its mass spectrometric
characterization.


Materials and methods 

Chemicals, enzymes, and the vector pQE30 

The restriction and modifying enzymes used in this work were purchased from TaKaRa,
and the vector pQE30 and the bacterial strains M15 and DH5α were from Qiagen. Trizol
and Superscript II reverse transcriptase were purchased from Gibco. Trypsin
(sequencing grade) was purchased from Sigma. The chelating affinity column and the
low molecular weight (LMW) marker were purchased from Amersham-Pharmacia Biotech. All
other chemicals were of analytical grade and were from Sigma.

Bacterial strains and culture media 

Escherichia coli DH5α was utilized for propagation of plasmids. DH5α was maintained
on LB agar plates and grown at 37°C, while M15 was cultured on LB agar plates
containing kanamycin (25 mg/L). For agar plates, Bacto agar was added to the media to
a final concentration of 1.5% (w/v). Ampicillin was added to the media at a final
concentration of 100 mg/L for the selection of transformants. E. coli M15 was chosen
as the host for gene expression. The strains were maintained in LB medium including
15% glycerol at -80°C. Ampicillin and kanamycin were added to the media at final
concentrations of 100 and 25 mg/L, respectively.

Cloning of SARS_3CL pro gene in pQE30

All cloning techniques, including PCR, restriction digestion, ligation, E. coli
transformation, and plasmid DNA preparation, were carried out according to the
literature method [26].

SARS_CoV (isolate BJ01) RNA was extracted with Trizol reagent according to the
manufacturer's instructions (www.genehub.net/trizol.htm). Reverse transcription was
performed by the random priming method with Superscript II reverse transcriptase. The
SARS_3CL pro cDNA was subsequently amplified by PCR, using the following primers:
3CLf (5'-GGGGGATCCACCATGAGTGGTTTTAGGAAAATGGCA-3') and
3CLr (5'-GGGAAGCTTTTGGAAGGTAACACCAGAGCA-3').
After digestion with BamHI and HindIII, the PCR product was inserted into the BamHI
and HindIII sites of the vector pQE30 (Qiagen). The residues in the expression tag
are “MRGSHHHHHHGSTM”. The SARS_3CL pro insert was verified by sequencing.
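
As a quick consistency check on the primer design described above, the restriction sites can be located in each primer and the forward primer translated in frame. The sketch below is illustrative: the recognition sequences for BamHI (GGATCC) and HindIII (AAGCTT) and the standard genetic code are well established, but the helper names and the choice to start the reading frame at the first base of the BamHI site are assumptions made here (the latter is consistent with the "GS" that precedes "TM" in the expression tag).

```python
BAMHI, HINDIII = "GGATCC", "AAGCTT"  # recognition sequences

PRIMER_F = "GGGGGATCCACCATGAGTGGTTTTAGGAAAATGGCA"  # 3CLf, 5'->3'
PRIMER_R = "GGGAAGCTTTTGGAAGGTAACACCAGAGCA"        # 3CLr, 5'->3'

# Standard genetic code, codons ordered T, C, A, G at each position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

def translate(dna):
    """Translate the complete codons of dna, reading from its first base."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))

bam_pos = PRIMER_F.find(BAMHI)     # 0-based position of the BamHI site
hind_pos = PRIMER_R.find(HINDIII)  # 0-based position of the HindIII site

# Assumed reading frame: starts at the first base of the BamHI site.
in_frame = translate(PRIMER_F[bam_pos:])
print(bam_pos, hind_pos, in_frame)
```

Under this frame assumption the forward primer encodes the last four residues of the tag (GSTM) followed by the first residues of the protease, which is a convenient sanity check before ordering primers.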

Expression and purification of SARS_3CL pro

Escherichia coli M15 cells transformed with the plasmid pQE30-SARS_3CL pro were grown
in 100 ml LB medium containing ampicillin (100 mg/L) and kanamycin (25 mg/L) at 37°C
overnight and then inoculated into 1 L LB supplemented with both antibiotics. The
expression of SARS_3CL pro was induced by the addition of 0.5 mM isopropyl
β-D-thiogalactoside (IPTG). After induction for 5 h at 18°C, the cells were harvested
by centrifugation at 4000g and 4°C for 30 min. The pellet was washed, frozen, and
then disrupted by sonication in Buffer A (20 mM Tris-HCl, 0.5 M NaCl, and 5 mM
imidazole, pH 8.0). The lysed cells were centrifuged at 14,000g at 4°C for 1 h; the
supernatant was retained and the pellet discarded. A 1-ml HiTrap Ni 2+ chelating
column was equilibrated with 10 ml of sterile deionized water, 50 mM NiSO4, and
finally 10 ml Buffer A. The supernatant was passed over the column at a flow rate of
5 ml/min, and the column was then washed with 20 ml Buffer A and 20 ml Buffer B
(80 mM imidazole in Buffer A) in turn. The protease of interest was eluted with 10 ml
Buffer C (20 mM Tris-HCl, 0.5 M NaCl, and 0.5 M imidazole, pH 8.0) and then purified
further by gel filtration using a HiTrap 16/60 Sephacryl S100 column pre-equilibrated
with Buffer D (5 mM dithiothreitol, 150 mM NaCl, and 10 mM Tris-HCl, pH 7.5) on an
FPLC system (Pharmacia). Highly purified His-tagged SARS_3CL pro was obtained with a
yield of 15 mg/L.

In-gel digest and peptide extraction 

The protocol used for the in-gel digest in this study was modified from the
literature method described by Yu et al. [27]. The gel band of interest
(SARS_3CL pro) was excised from the Coomassie-stained SDS-PAGE gel with a steel
scalpel and destained in an Eppendorf tube by washing sequentially with 100 µl of 30%
CH3CN/100 mM ammonium bicarbonate. The washing step was repeated until the gel band
was clear. The gel band was then completely dried in a Speed-Vac vacuum centrifuge
(Savant, Holbrook, NY) and cut into small pieces. The dried pieces were re-swollen by
adding about 30 µl of 50 mM ammonium bicarbonate (pH 8.3). The volume added was the
minimum necessary to completely cover the gel pieces, and trypsin was then added at an
enzyme-to-sample ratio of 1:20 (w/w). The gel pieces were incubated at 37°C for
12-16 h. The tryptic peptides were extracted by adding 30 µl of a solution containing
60% CH3CN/0.1% TFA and vortexing for 4 min before removing the solution. This
extraction step was performed three times with the same solution. The extracts were
pooled in a 0.5 ml Eppendorf tube and evaporated to 10-20 µl in the Speed-Vac vacuum
centrifuge.
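
The peptides expected from a tryptic digest such as the one described above can be approximated in silico. The sketch below uses the common simplified rule that trypsin cleaves C-terminal to lysine or arginine unless the next residue is proline; this rule, the function name, and the neglect of missed cleavages are assumptions for illustration and are not part of the protocol of this work. The example input is the expression tag followed by the first residues of the protease.

```python
def trypsin_digest(seq):
    """Split seq after every K or R that is not followed by P."""
    peptides, start = [], 0
    for i, aa in enumerate(seq):
        nxt = seq[i + 1] if i + 1 < len(seq) else ""
        if aa in "KR" and nxt != "P":
            peptides.append(seq[start:i + 1])
            start = i + 1
    if start < len(seq):
        # Keep the C-terminal remainder that does not end in K or R.
        peptides.append(seq[start:])
    return peptides

# Expression tag plus the N-terminal residues of SARS_3CL pro.
n_terminus = "MRGSHHHHHHGSTMSGFRKMAFPSGK"
print(trypsin_digest(n_terminus))
```

Very short products such as single residues are rarely observed by LC/MS, which is one reason why measured sequence coverage is usually well below 100%.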

LC-ion trap-MS and MS/MS 

The LC/MS system used for analyzing the tryptic peptides was a combination of an HP
1100 LC system (Agilent, Cheshire, UK) with an LCQ-DECA mass spectrometer
(ThermoFinnigan, San Jose, CA). A microbore reverse-phase column (C8, 50 × 1.0 mm ID,
7 µm, ABI RP300) was used for LC separation. Solvent A was 0.1% formic acid (FA) in
100% (v/v) water and solvent B was 0.1% FA in 100% (v/v) CH3CN. The gradient started
at 5% B, was held for 2 min, and went linearly to 80% B in 50 min. The peptide mixture
was injected into the column by an autosampler and separated at a flow rate of
200 µl/min. The fractions were detected by PDA (TSP UV6000) and directly introduced
on-line into the ESI source. The operating conditions were optimized with the
standard solution provided by the manufacturer, and the working parameters of the ion
source were as follows: capillary temperature, 200°C; spray voltage, 5 kV; capillary
voltage, 15 V; and sheath gas flow rate, 20 arb. To acquire more mass spectra within
an LC peak, two scan modes, full scan and data-dependent MS/MS, were used. The scan
mass range was from m/z 400 to m/z 2000 and the collision energy was set at 38%.
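
The gradient program described above can be written as a piecewise-linear function of time. This is a minimal sketch under one stated assumption: the linear ramp from 5% to 80% B is taken to last 50 min after the 2-min hold (the text could also be read as reaching 80% B at 50 min). The function and parameter names are hypothetical.

```python
def percent_b(t_min, start=5.0, end=80.0, hold=2.0, ramp=50.0):
    """Return %B at time t_min for a hold followed by a linear ramp.

    Assumes the ramp begins when the hold ends and lasts `ramp` minutes.
    """
    if t_min <= hold:
        return start
    if t_min >= hold + ramp:
        return end
    return start + (end - start) * (t_min - hold) / ramp

# Tabulate the solvent composition every 10 min of the run.
for t in range(0, 61, 10):
    print(f"{t:3d} min  {percent_b(t):5.1f} % B")
```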


Results and discussion 

Construction of the expression vector pQE30-SARS_3CL pro

The SARS_3CL pro PCR product, verified by sequencing, was digested with BamHI and
HindIII and then inserted into the BamHI and HindIII sites of the vector pQE30.
E. coli M15 cells transformed with the plasmid pQE30-SARS_3CL pro were used for the
expression of His-tagged SARS_3CL pro.

Expression and purification of SARS_3CL pro 

After optimization of the expression and purification method for SARS_3CL pro from
E. coli, the homogeneous protein was successfully isolated by two chromatographic
steps. The SDS-PAGE analysis of the purification of SARS_3CL pro is shown in Fig. 1,
and the purification scheme of SARS_3CL pro expressed in E. coli transformed with
pQE30-SARS_3CL pro in 1 L culture is listed in Table 1.

Fig. 1. SDS-PAGE analysis of the purification of SARS_3CL pro (1, marker; 2,
supernatant; 3, pellet; 4, after purification on the NTA-Ni 2+ affinity column; and
5, after purification by the FPLC system). Marker bands: 97, 66, 45, 30, and 20.1 kDa.

Table 1
Purification scheme of SARS_3CL pro expressed in E. coli transformed with pQE30-SARS_3CL pro (1 L culture)

Step                     Total protein (mg)   SARS_3CL pro (mg)   Purification factor   3CL pro yield (%)
Extraction               500                  50                  1                     100
Ni 2+ -affinity column   50                   40                  8                     80
Gel filtration           15                   15                  10                    30

These results show that the pQE30-SARS_3CL pro plasmid expressed in E. coli M15 cells
can produce a large amount of soluble SARS_3CL pro. The purification procedure is
also straightforward to carry out.
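
The purification factor and yield columns of Table 1 follow directly from the total-protein and SARS_3CL pro amounts. The short Python sketch below recomputes them; the amounts are those of Table 1, while the function name and the definitions used (purification factor as purity relative to the extraction step, yield as target protein relative to the extraction step) are the conventional ones and are assumed here rather than stated in the text.

```python
# (step, total protein in mg, SARS_3CL pro in mg), values from Table 1
steps = [
    ("Extraction", 500.0, 50.0),
    ("Ni2+-affinity column", 50.0, 40.0),
    ("Gel filtration", 15.0, 15.0),
]

def purification_summary(steps):
    """Return (step, purification factor, yield in %) for each step."""
    _, total0, target0 = steps[0]
    purity0 = target0 / total0  # purity of the crude extract
    rows = []
    for name, total, target in steps:
        fold = (target / total) / purity0
        yield_pct = 100.0 * target / target0
        rows.append((name, round(fold, 1), round(yield_pct, 1)))
    return rows

for name, fold, yield_pct in purification_summary(steps):
    print(f"{name:22s} {fold:5.1f}-fold  {yield_pct:5.1f} %")
```

With these definitions the computed values agree with the purification factors (1, 8, 10) and yields (100, 80, 30%) listed in Table 1.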


LC/MS and LC/MS/MS analysis of SARS_3CL pro and tryptic peptides

The database search using the MS/MS raw data of the tryptic peptides from the gel band
shows that 3CL pro is the first candidate, with a summary score of 456.5, which is
much higher than that of the second candidate (score 46.5). It also shows that eleven
peptides (T1-T11) were matched with tryptic peptides of 3CL pro (Fig. 2).

The MS/MS spectrum of the doubly charged precursor ion of the T3 peptide at m/z 566.2
is displayed as an example in Fig. 3. Most of the b and y ions were detected (>80%).
The total number of amino acids contained within the eleven tryptic peptides was 145
(Scheme 1), giving a protein coverage of 46.4%.

MRGSHHHHHHGSTMSGFRKMAFPSGKVEGCMVQVTCGTTTLNGLWLDDT
VYCPRHVICTAEDMLNPNYEDLLIRKSNHSFLVQAGNVQLRVIGHSMQNC
LLRLKVDTSNPKTPKYKFVRIQPGQTFSVLACYNGSPSGVYQCAMRPNHTIK
GSFLNGSCGSVGFNIDYDCVSFCYMHHMELPTGVHAGTDLEGKFYGPFVDR
QTAQAAGTDTTITLNVLAWLYAAVINGDRWFLNRFTTTLNDFNLVAMKY
NYEPLTQDHVDILGPLSAQTGIAVLDMCAALKELLQNGMNGRTILGSTILE
DEFTPFDVVRQCSGVTFQKLN

Scheme 1. Sequence of SARS_3CL pro showing the coverage of the protein obtained by
mass spectrometry of the in-gel tryptic digest. (Fragments of the sequence resolved by
LC/MS/MS are shown in boldface and underline, and the expression tag is highlighted in
a pane.)

Fig. 2. Base peak chromatogram of the peptides from LC/MS analysis of tryptic digests of SARS_3CL pro.

Fig. 3. MS/MS spectrum of the doubly charged precursor ion with m/z 566.2 (T3) at a retention time of 10.25 min. A sequence is confirmed from the
labeled b- and y-ions in the spectrum. Ions observed in the spectrum are underlined and assigned.

Fig. 4. Multiple charge ions of SARS_3CL pro protease.

Fig. 5. Molecular weight of SARS_3CL pro protease.

The molecular weight of 3CL pro was also determined by the LC-MS system. The LC
conditions were based on those described above for peptide separation. An LC peak at a
retention time of 18 min was observed (data not shown). The mass spectrum
corresponding to this LC peak gave a series of multiply charged ions (Fig. 4). The
molecular weight was obtained with the deconvolution algorithm in the “sequest”
program. The measured mass of 3CL pro protease is 35831 (Fig. 5), and the difference
between the measured and theoretical mass (MW 35832) was only 1 Da. These results
confirm the identity of the expressed SARS_3CL pro in this work.
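
The relation between the m/z values of multiply charged ions and the neutral mass, which underlies both the doubly charged T3 precursor and the deconvolution of the intact protein spectrum, can be sketched as follows. This is a generic illustration assuming positive-mode ions formed by protonation only; it is not the algorithm of the program used in this work, and the function names are hypothetical.

```python
PROTON = 1.007276  # mass of a proton in Da

def mz(neutral_mass, z):
    """m/z of an ion carrying z protons."""
    return (neutral_mass + z * PROTON) / z

def neutral_mass(mz_value, z):
    """Neutral mass from an observed m/z and a known charge z."""
    return z * (mz_value - PROTON)

def charge_from_adjacent(mz_low_charge, mz_high_charge):
    """Charge z of the peak at mz_low_charge, given the adjacent peak
    at mz_high_charge that carries one more proton (z + 1)."""
    return round((mz_high_charge - PROTON) / (mz_low_charge - mz_high_charge))

# Doubly charged T3 precursor reported at m/z 566.2.
print(f"T3 neutral mass: {neutral_mass(566.2, 2):.2f} Da")

# Two adjacent charge states simulated for the measured mass of 35831 Da.
m30, m31 = mz(35831.0, 30), mz(35831.0, 31)
print(charge_from_adjacent(m30, m31), f"{neutral_mass(m30, 30):.1f}")

# Relative deviation between measured and theoretical mass, in ppm.
print(f"{(35832 - 35831) / 35832 * 1e6:.1f} ppm")
```

In practice a deconvolution routine applies the adjacent-peak relation across the whole charge envelope and averages the resulting mass estimates.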

In conclusion, we have succeeded in constructing the plasmid pQE30-SARS_3CL pro, and
with this plasmid, using E. coli as the expression system, a large amount of purified
His-tagged SARS_3CL pro protease has been obtained by NTA-Ni 2+ affinity
chromatography. The protease obtained can be used to screen crystallization
conditions for X-ray crystallographic analysis.